Method And Apparatus For Image Restoration, Storage Medium And Terminal

ABSTRACT

The present disclosure provides a method and apparatus for image restoration, a storage medium and a terminal. The method includes: acquiring a to-be-processed image including biometric information; inputting the to-be-processed image into a generator, wherein the generator comprises a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator; and restoring the to-be-processed image by the generator to acquire a restored image. The present disclosure can effectively improve restoration quality when performing image restoration based on machine learning.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of priority to Chinese patent application No. 202010357097.7, filed on Apr. 29, 2020, entitled “Method and Apparatus for Image Restoration, Storage Medium and Terminal”, the entire disclosure of which is incorporated herein by reference.

TECHNICAL FIELD

The present disclosure relates to the technical field of image processing, and more particularly to a method and apparatus for image restoration, a storage medium and a terminal.

BACKGROUND

In existing technology, when an under-screen optical fingerprint image is captured, if the imaging of a fingerprint on a sensor falls into a signal saturation area of the sensor, texture information of the fingerprint will be lost. Further, as the incident angle of light increases, the transmission path of the light carrying fingerprint information before reaching the sensor increases accordingly, which results in a weakening of the actual optical signal received by the sensor. When the incident angle increases to a certain angle, the signal-to-noise ratio of the optical signal received by the sensor is too small for the signal to be detected and form a clear fingerprint image.

Therefore, restoring fingerprint images captured by the sensor has become a major research focus in the field of fingerprint acquisition technology.

SUMMARY

The present disclosure provides an improved restoration method, which can improve the quality of restored images when performing image restoration based on machine learning.

An embodiment of the present disclosure provides a method for image restoration. The method may include: acquiring a to-be-processed image, wherein the to-be-processed image includes biometric information; inputting the to-be-processed image into a generator, wherein the generator includes a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator; and restoring the to-be-processed image by the generator to acquire a restored image.

In some embodiments, determining a weight of a convolutional kernel of the generator at least according to a quality of an image historically restored by the generator includes: updating the weight of the convolutional kernel of the generator according to an evaluation result of a discriminator on the image historically restored by the generator. The discriminator includes a neural network model with a plurality of convolutional layers.

In some embodiments, there is a connection relationship among the plurality of convolutional layers of the neural network model of the generator and/or the plurality of convolutional layers of the neural network model of the discriminator.

In some embodiments, updating the weight of the convolutional kernel of the generator according to an evaluation result of a discriminator on the image historically restored by the generator includes: acquiring the image historically restored by the generator; inputting the historically restored image into the discriminator; acquiring a first evaluation result of the discriminator on the historically restored image; and updating the weight of the convolutional kernel of the generator at least according to a loss function of the generator and the first evaluation result.

In some embodiments, the loss function of the generator includes an adversarial loss function and an L1 loss function, and updating the weight of the convolutional kernel of the generator at least according to the loss function of the generator and the first evaluation result includes: calculating a first output value according to the first evaluation result and the adversarial loss function of the generator; calculating a second output value according to the historically restored image, a standard image corresponding to the historically restored image and the L1 loss function; and updating the weight of the convolutional kernel of the generator according to the first output value and the second output value.

In some embodiments, calculating a first output value according to the first evaluation result and the adversarial loss function of the generator includes: calculating the first output value based on the following formula:

$L_{g\_adv} = \max\limits_{G}\left( E_{z \sim p(z)}\left\lbrack D\left( G(z) \right) \right\rbrack \right);$

wherein $L_{g\_adv}$ represents the first output value, $\max\limits_{G}(\ )$ represents a value of G when the function $E_{z \sim p(z)}[D(G(z))]$ has a maximum value, G represents the generator, and the value of G calculated based on the above formula is the first output value; $E_{z \sim p(z)}(u)$ represents a mean value of a function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(G(z)) represents the first evaluation result, and G(z) represents the historically restored image.

In some embodiments, calculating a second output value according to the historically restored image, a standard image corresponding to the historically restored image and the L1 loss function includes: calculating the second output value based on the following formula:

L₁ = ∥x − G(z)∥₁;

wherein L₁ represents the second output value, x represents the standard image, z represents the to-be-processed image, and G(z) represents the historically restored image.

In some embodiments, the method further includes: updating the weight of the convolutional kernel of the discriminator according to an evaluation result of the discriminator on the image historically restored by the generator and a standard image corresponding to the historically restored image.

In some embodiments, updating the weight of the convolutional kernel of the discriminator according to an evaluation result of the discriminator on the historically restored image by the generator and a standard image corresponding to the historically restored image includes: acquiring the image historically restored by the generator and the corresponding standard image; inputting the historically restored image into the discriminator to obtain a first evaluation result, and inputting the standard image into the discriminator to obtain a second evaluation result; calculating a third output value at least according to an adversarial loss function of the discriminator, the first evaluation result and the second evaluation result; and updating the weight of the convolutional kernel of the discriminator according to the third output value.

In some embodiments, calculating a third output value at least according to an adversarial loss function of the discriminator, the first evaluation result and the second evaluation result includes: calculating the third output value based on the following formula:

$L_{d\_adv} = \min\limits_{D}\left( E_{x \sim q(x)}\left\lbrack \max\left( 0, 1 - D(x) \right) \right\rbrack + E_{z \sim p(z)}\left\lbrack \max\left( 0, 1 + D\left( G(z) \right) \right) \right\rbrack \right) + \lambda E_{\hat{I} \sim p_{\hat{I}}}\left( \left( \nabla_{\hat{I}} D\left( \hat{I} \right) \right)^{2} - 1 \right)^{2};$

wherein $L_{d\_adv}$ represents the third output value, $\min\limits_{D}(\ )$ represents a value of D when the function $E_{x \sim q(x)}[\max(0, 1 - D(x))] + E_{z \sim p(z)}[\max(0, 1 + D(G(z)))]$ has a minimum value, D represents the discriminator, the value of D calculated based on the above formula represents the third output value, $E_{x \sim q(x)}(u)$ represents a mean value of a function u when x obeys q(x), q(x) represents a distribution of the standard image, x represents the standard image, $E_{z \sim p(z)}(u)$ represents a mean value of the function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(x) represents the second evaluation result, D(G(z)) represents the first evaluation result, G(z) represents the historically restored image, λ represents a preset hyperparameter, ∇( ) represents a gradient penalty function, Î represents an interpolation function between the distribution of q(x) and the distribution of p(z), and $\nabla_{\hat{I}} D(\hat{I})$ represents a gradient penalty of the discriminator by the interpolation function Î between the distribution of q(x) and the distribution of p(z).

In some embodiments, updating the weight of the convolutional kernel of the generator and updating the weight of the convolutional kernel of the discriminator are performed several times.

In some embodiments, the discriminator includes a plurality of residual modules and a self-attention module. The plurality of residual modules are connected in series and configured to receive a feature map of a to-be-evaluated image or a feature map processed by an upper-level residual module. The to-be-evaluated image is the image historically restored by the generator or a standard image corresponding to the historically restored image, and each residual module includes one or more convolutional layers. The self-attention module has an input end connected with a residual module of the plurality of residual modules to receive the feature map processed by the residual module. The self-attention module is configured to extract global features of an input feature map, and an output end of the self-attention module is connected with another residual module of the plurality of residual modules.

In some embodiments, one or more of the plurality of residual modules include a channel attention module. The channel attention module is configured to perform a weighting process on channels of the input feature map, and the input feature map is a feature map processed by the one or more convolutional kernels of the residual module.

In some embodiments, the channel attention module includes: a global average pooling unit configured to perform a global average pooling on the input feature map; a linear correction unit configured to perform a linear correction on the feature map on which the global average pooling has been performed; and an s activation function unit configured to determine a weight of each channel according to the feature map on which the linear correction has been performed. Adjacent units are connected through a full connection layer.

In some embodiments, the channel attention module further includes: a first weighted summation unit configured to perform a weighted summation on the input feature map according to the weight of each channel determined by the s activation function unit.

In some embodiments, the input end of the self-attention module is connected with an output end of a second residual module, and the output end of the self-attention module is connected with an input end of a third residual module.

In some embodiments, the self-attention module includes: a query unit, a key unit, a value unit, a self-similarity calculation unit, a normalization processing unit, and a second weighted summation unit. The query unit is configured to perform convolution on the input processed feature map by a query convolutional kernel to obtain a query convolution processing result. The key unit is configured to perform convolution on the input processed feature map by a key convolutional kernel to obtain a key convolution processing result. The value unit is configured to perform convolution on the input processed feature map by a value convolutional kernel to obtain a value convolution processing result. The self-similarity calculation unit is configured to calculate a self-similarity of the query convolution processing result and the key convolution processing result. The normalization processing unit is configured to normalize the calculated self-similarity based on a preset regression function to obtain a normalized weight. The second weighted summation unit is configured to perform weighted summation on the value convolution processing result according to the normalized weight to obtain a weighted summation result. The output of the self-attention module is generated according to the weighted summation result.

In some embodiments, the generator processes the to-be-processed image based on partial convolution and/or LBAM.

In some embodiments, the biometric information includes fingerprint or palm print information.

Another embodiment of the present disclosure provides an apparatus for image restoration. The apparatus includes a first acquisition circuitry and a processing circuitry. The first acquisition circuitry is configured to acquire a to-be-processed image including biometric information, and to input the to-be-processed image into a generator. The processing circuitry includes the generator, and the generator is configured to restore the to-be-processed image and includes a neural network model with a plurality of convolutional layers. A weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator.

Another embodiment of the present disclosure provides a non-transitory storage medium having computer instructions stored therein, wherein the computer instructions are executed to perform steps of the method according to embodiments of the present disclosure.

Another embodiment of the present disclosure provides a terminal including a memory and a processor, wherein the memory is stored with computer instructions executable on the processor, and the computer instructions are executed by the processor to perform steps of the method according to embodiments of the present disclosure.

Compared with conventional technologies, embodiments of the present disclosure have the following beneficial effects.

According to an embodiment of the present disclosure, the image restoration method includes: acquiring a to-be-processed image including biometric information; inputting the to-be-processed image into a generator, wherein the generator includes a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator; and restoring the to-be-processed image by the generator to acquire a restored image.

Compared with existing solutions of restoring images based on machine learning, embodiments of the present disclosure perform image restoration based on the neural network model, and the convolutional kernel of the neural network model is adjusted according to historical restoration results to optimize the quality of the restored image output by the generator. Specifically, the process of adjusting the convolutional kernel can at least be implemented in a model training stage according to the quality of the historically restored image of the generator, so as to obtain a generator that is more in line with actual needs. Further, in the actual application stage, the convolutional kernel can be further adjusted based on the historically restored images. With the increase of historical data, the amount of feedback data for adjusting the convolutional kernel becomes larger and larger, making the adjustment of the convolutional kernel more accurate, which helps improve the restoration quality when performing image restoration based on machine learning.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 schematically illustrates a flowchart of a method for image restoration according to an embodiment of the present disclosure;

FIG. 2 schematically illustrates a flowchart of a method for adjusting a convolutional kernel according to an embodiment of the present disclosure;

FIG. 3 illustrates a schematic structural diagram of a residual module according to an embodiment of the present disclosure;

FIG. 4 illustrates a schematic structural diagram of a channel attention module in FIG. 3;

FIG. 5 illustrates a schematic structural diagram of a self-attention module according to an embodiment of the present disclosure; and

FIG. 6 illustrates a schematic diagram of an apparatus for image restoration according to an embodiment of the present disclosure.

DETAILED DESCRIPTION

As mentioned in the background, in existing fingerprint collection solutions, it is necessary to perform restoration on the collected images at a later stage.

An intuitive solution is to image multiple times so that the images complement each other, and to combine the multiple collected images to obtain a complete image. However, in practical applications, the fingerprint unlocking time of a mobile phone is very short, and a multiple-imaging scheme places very high requirements on the processing speed of the hardware of the mobile phone, which increases system cost.

An embodiment of the present disclosure provides a method for image restoration. The method includes: acquiring a to-be-processed image including biometric information; inputting the to-be-processed image into a generator, wherein the generator includes a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator; and restoring the to-be-processed image by the generator to acquire a restored image.

Embodiments of the present disclosure perform image restoration based on the neural network model, and the convolutional kernel of the neural network model is adjusted according to historical restoration results to optimize the quality of the restored image output by the generator. Specifically, the process of adjusting the convolutional kernel can at least be implemented in a model training stage according to the quality of the historically restored image of the generator, so as to obtain a generator that is more in line with actual needs. Further, in the actual application stage, the convolutional kernel can be further adjusted based on the historically restored images. With the increase of historical data, the amount of feedback data for adjusting the convolutional kernel becomes larger and larger, making the adjustment of the convolutional kernel more accurate, which helps improve the restoration quality when performing image restoration based on machine learning.

In order to make the above objects, features and beneficial effects of the present disclosure more obvious and understandable, specific embodiments of the present disclosure are described in detail in combination with the drawings.

FIG. 1 is a flow chart of a method for image restoration according to an embodiment of the present disclosure.

Specifically, referring to FIG. 1, the method may include the following steps:

S101, acquiring a to-be-processed image including biometric information.

S102, inputting the to-be-processed image into a generator, wherein the generator includes a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator.

S103, restoring the to-be-processed image by the generator to acquire a restored image.

In some embodiments, the biometric information may include fingerprint or palm print information. Next, an image containing fingerprint information is taken as an example for specific description.

In some embodiments, in S101, the to-be-processed image may be collected by a sensor. For example, the sensor can be integrated into an intelligent terminal such as a mobile phone or an iPad. The method in this embodiment can be executed by a processor of the intelligent terminal, or by a background server communicating with the intelligent terminal. The background server can be a cloud server.

In some embodiments, the generator may process the to-be-processed image based on partial convolution (PConv).

For example, the neural network model of the generator may be a U-net network structure model. Specifically, for an input to-be-processed image, the U-net network structure model first performs down-sampling on the to-be-processed image with convolutional kernels of different degrees. This process may also be called the encoding process, which learns deep features of the image. Then, the features of the image are restored by up-sampling. This process may be called the decoding process.

In the up-sampling, both the feature from the up-sampling (i.e., the feature of the decoder) and the feature from the down-sampling (i.e., the feature of the encoder) are received. For example, the (i+1)th convolutional layer may establish a connection channel (referred to as a channel) with the ith convolutional layer and the (n-i)th convolutional layer.

Correspondingly, each convolutional layer starting from the second layer can receive data output by at least one upper convolutional layer.

For each convolutional layer, the convolutional layer convolves the input to-be-processed image, and also convolves a mask corresponding to the input to-be-processed image. The mask characterizes whether each pixel of the to-be-processed image needs to be restored; for example, 1 indicates no restoration is needed, and 0 indicates restoration is needed.

For the generator using partial convolution, the mask and the image are updated every time data passes through one convolutional layer, where the data refers to image feature data (i.e., the feature) after convolution of a current convolutional layer. As the number of neural network layers increases, the number of pixels with a value of 0 in the output mask m′ becomes smaller and smaller, and the area of the effective region in a corresponding restored image x′ becomes larger and larger; thus the impact of the mask on the overall loss becomes smaller and smaller.

Finally, a Tanh hyperbolic function can be used as the last activation function. The value of Tanh ranges from −1 to 1, and the Tanh hyperbolic function converges faster than the S activation function (sigmoid) and produces symmetrically distributed results.

For example, the restored image x′ and the updated mask m′ can be obtained based on the following formula:

$x^{\prime} = \begin{cases} W^{T}\left( X \odot M \right)\dfrac{\mathrm{sum}(1)}{\mathrm{sum}(M)} + b, & \text{if } \mathrm{sum}(M) > 0 \\ 0, & \text{otherwise} \end{cases}; \qquad m^{\prime} = \begin{cases} 1, & \text{if } \mathrm{sum}(M) > 0 \\ 0, & \text{otherwise} \end{cases};$

wherein W represents the convolutional kernel of the convolutional layer, that is, the weights of the filters of the convolutional layer, T represents a transposition of a matrix, X represents the feature values of the input image, M represents the mask, which is a binary mask in this embodiment, ⊙ represents a unit multiplication, that is, an element-wise dot multiplication operation, b represents a bias of the filters of the convolutional layer, and m′ represents an output of the input mask after convolution. For each convolutional layer, the restored image x′ output by the convolutional layer is the output image after convolution by the convolutional layer. Similarly, for each convolutional layer, the updated mask m′ output by the convolutional layer is the output mask after convolution by the convolutional layer.

Specifically, the convolutional kernel W may be used to determine the number of features to be extracted from the input image.
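For illustration, the following is a minimal PyTorch sketch of the partial-convolution step described by the formula above; the function name partial_conv, the stride-1 zero-padding setup and the single-channel mask layout are assumptions of this example, not part of the disclosure.

```python
import torch
import torch.nn.functional as F

def partial_conv(x, mask, weight, bias):
    """One partial-convolution step (stride 1, zero padding).
    x: (N, C_in, H, W) features; mask: (N, 1, H, W) binary mask, 1 = valid;
    weight: (C_out, C_in, kh, kw); bias: (C_out,)."""
    kh, kw = weight.shape[2:]
    pad = (kh // 2, kw // 2)
    # sum(M): count of valid pixels in each window; sum(1): window size
    ones = torch.ones(1, 1, kh, kw, device=x.device)
    sum_m = F.conv2d(mask, ones, padding=pad)
    sum_1 = float(kh * kw)
    # W^T (X ⊙ M): convolve only the masked (valid) features
    raw = F.conv2d(x * mask, weight, bias=None, padding=pad)
    valid = sum_m > 0
    scale = torch.where(valid, sum_1 / sum_m.clamp(min=1.0),
                        torch.zeros_like(sum_m))
    x_out = raw * scale + bias.view(1, -1, 1, 1)
    x_out = torch.where(valid, x_out, torch.zeros_like(x_out))
    m_out = valid.float()  # m' = 1 wherever the window saw a valid pixel
    return x_out, m_out
```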

In some embodiments, the generator may process the to-be-processed image based on learnable bidirectional attention maps (LBAM, referred to as a learnable bidirectional mask).

For example, in the above-mentioned PConv-based U-net network structure model, the process of updating the mask may only occur in the encoding stage. In the decoding stage, all values of the mask are 1.

In some embodiments, the learnable bidirectional attention map can be introduced into an LBAM model. Suppose that X is the input image and M is the corresponding mask, where 1 represents a pixel with valid fingerprint information, and 0 represents a pixel without valid fingerprint information.

In the forward attention mask, M is used as an input mainly used to modify the features during encoding. In the encoding stage, the value of the mask is gradually updated, and the features are corrected with the mask during the encoding.

In contrast, in the decoding stage, 1−M is used as the mask of the last layer to modify the features of the last layer in the decoder. Further, the mask of each previous layer in the decoder is gradually updated forward, and the corresponding mask is used to modify the features of the previous layer in the decoder.

In some embodiments, a bidirectional attention mask is adopted so that the decoder can pay more attention to how to restore areas without fingerprints. Therefore, by adopting the bidirectional attention mask, irregular areas without fingerprints can be better restored.

Specifically, in the encoding stage, down-sampling is performed based on the following formulas (1) to (3):

$M_{C}^{in} = g_{A}\left( W_{m}^{T} M^{in} \right) \quad (1);$

$F^{out} = \left( W_{f}^{T} F^{in} \right) \odot M_{C}^{in} \quad (2);$

$M^{out} = g_{m}\left( M_{C}^{in} \right) \quad (3);$

wherein $M^{in}$ represents the input mask of the current layer in the encoder, $W_{m}^{T}$ represents the convolutional kernel corresponding to the updated mask $M_{C}^{in}$, the function $g_{A}(\ )$ represents an asymmetric activation function with a shape similar to the Gaussian function, the function $g_{m}(\ )$ represents an activation function for updating the mask, $F^{out}$ represents the output feature of the current layer in the encoder, that is, the input feature of the next layer, $F^{in}$ represents the input feature of the current layer in the encoder, $W_{f}^{T}$ represents a corresponding convolutional kernel, ⊙ represents a dot multiplication operation, and $M^{out}$ represents the output mask of the current layer in the encoder, that is, the input mask of the next layer.

The function $g_{A}(\ )$ may be indicated by the following formula:

$g_{A}\left( M^{c} \right) = \begin{cases} a\,\exp\left( -\gamma_{l}\left( M^{c} - \mu \right)^{2} \right), & M^{c} < \mu \\ 1 + \left( a - 1 \right)\exp\left( -\gamma_{r}\left( M^{c} - \mu \right)^{2} \right), & M^{c} \geq \mu \end{cases};$

wherein a, μ, $\gamma_{l}$, and $\gamma_{r}$ are constants. In some embodiments, a is 1.1, μ is 2.0, $\gamma_{l}$ is 1.0, and $\gamma_{r}$ is 1.0.

The function $g_{m}(\ )$ may be indicated by the following formula:

$g_{m}\left( M^{c} \right) = \left( \mathrm{ReLU}\left( M^{c} \right) \right)^{\alpha};$

wherein the function ReLU( ) is a linear correction function that sets a value less than 0 to 0. In some embodiments, α is 0.8.
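As a concrete reading of the two activation functions, here is a small NumPy sketch using the constants given above; the branch condition $M^{c} < \mu$ (left/right of the peak) is the assumption made explicit in the reconstructed piecewise formula.

```python
import numpy as np

def g_A(mc, a=1.1, mu=2.0, gamma_l=1.0, gamma_r=1.0):
    """Asymmetric, Gaussian-shaped activation used on the forward mask."""
    left = a * np.exp(-gamma_l * (mc - mu) ** 2)            # M^c < mu branch
    right = 1.0 + (a - 1.0) * np.exp(-gamma_r * (mc - mu) ** 2)
    return np.where(mc < mu, left, right)

def g_m(mc, alpha=0.8):
    """Mask-updating activation: values below 0 are set to 0 (ReLU),
    then the result is raised to the power alpha."""
    return np.maximum(mc, 0.0) ** alpha
```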

In the decoding stage, the (L-l)th layer in the decoder receives the feature and the mask of the (l+1)th layer in the encoder, and also receives the feature and the mask of the (L-l-1)th layer in the decoder. Thus, more attention can be paid to restoring the areas that need to be restored in the (L-l)th layer. Specifically, it can be expressed based on formulas (4) and (5):

$F_{d}^{out} = \left( W_{e}^{T} F_{e}^{in} \right) \odot g_{A}\left( M_{e}^{c} \right) + \left( W_{d}^{T} F_{d}^{in} \right) \odot g_{A}\left( M_{d}^{c} \right) \quad (4);$

$M_{d}^{\prime} = g_{m}\left( M_{d}^{c} \right) \quad (5);$

wherein $W_{e}^{T}$ and $W_{d}^{T}$ represent corresponding convolutional kernels, $M_{e}^{c}$ and $F_{e}^{in}$ respectively represent the mask and the feature of the (l+1)th layer in the encoder, $M_{d}^{c}$ and $F_{d}^{in}$ respectively represent the mask and the feature of the (L-l-1)th layer in the decoder, and $F_{d}^{out}$ and $M_{d}^{\prime}$ respectively represent the feature and the mask output by the (L-l)th layer in the decoder.

With the above solution, both the mask in the encoding stage and the mask in the decoding stage are considered in the decoding stage. Specifically, in the decoding stage, the mask is updated reversely, and the updated mask is used for the previous layer in the decoder; in the encoding stage, the updated mask is used for the next layer in the encoder. In this way, the model can pay more attention to the areas to be restored.
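A compact sketch of the decoder-side fusion of formulas (4) and (5), reusing g_A and g_m from the sketch above; conv_e and conv_d stand in for the convolutions with $W_{e}^{T}$ and $W_{d}^{T}$ and are hypothetical names, as is the assumption that all shapes are compatible.

```python
def decode_layer(conv_e, f_e_in, m_e_c, conv_d, f_d_in, m_d_c):
    """Formula (4): fuse the encoder feature and the decoder feature, each
    modulated by its own activated mask; formula (5): update the decoder mask."""
    f_d_out = conv_e(f_e_in) * g_A(m_e_c) + conv_d(f_d_in) * g_A(m_d_c)
    m_d_out = g_m(m_d_c)
    return f_d_out, m_d_out
```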

In some embodiments, after S103, the method also includes the following step: calculating the L1 loss function and the adversarial loss function of the restored image compared with the corresponding standard image.

The L1 loss function L1_loss of the restored image compared with the corresponding standard image can be calculated based on the following formula:

L₁ = ∥G(z) − x∥₁;

wherein L₁ represents the L1 loss function L1_loss, G(z) represents the restored image output by the generator, z represents the to-be-processed image, and x represents the standard image.

In some embodiments, in S103, the restored image output by the generator can be transmitted to a corresponding module of the intelligent terminal for subsequent operation, such as a fingerprint unlocking operation.

Next, the process of adjusting the convolutional kernel W in the above formulas based on historical data will be described in detail. Referring to FIG. 2, a flowchart of a method for adjusting the convolutional kernel according to an embodiment of the present disclosure is shown.

Specifically, the weight of the convolutional kernel of the generator can be updated according to an evaluation result of a discriminator on the image historically restored by the generator. The discriminator includes a neural network model with a plurality of convolutional layers.

In some embodiments, there is a connection relationship among the plurality of convolutional layers of the neural network model of the generator and/or the plurality of convolutional layers of the neural network model of the discriminator.

Specifically, the weight of the convolutional kernel of the generator can be updated according to an evaluation result of the discriminator on the image historically restored by the generator by the following steps.

S201, acquiring the image historically restored by the generator.

S202, inputting the historically restored image into the discriminator.

S203, acquiring a first evaluation result of the discriminator on the historically restored image.

S204, updating the weight of the convolutional kernel of the generator at least according to a loss function of the generator and the first evaluation result.

In some embodiments, in S201, the generator can establish a connection channel with the discriminator to transmit the restored image. Specifically, the restored image output by the generator can be regarded as the historically restored image.

In some embodiments, the discriminator can be used to determine an adversarial loss function of the generator to judge the quality of the image restored by the generator.

For example, the restored image finally output by the generator and the corresponding standard image can be input into the discriminator, and the discriminator can output an image difference degree between the two images. The image difference degree can be used to measure the restoration quality of the image. Specifically, the discriminator can distinguish the restored image from the corresponding standard image. When the discriminator cannot distinguish whether the image is the restored image or the corresponding standard image, the quality of the restored image is the best. In some embodiments, peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) can be used to judge the quality of the restored image.

In some embodiments, in S202, the 256×256 historically restored image and the corresponding standard image are input, with one channel, into the first convolutional layer of the discriminator.

Furthermore, for the plurality of convolutional layers in the discriminator, the number of convolutional kernels of each convolutional layer is gradually increased. That is to say, with the downward transmission of the image, the number of convolutional kernels is increased. A feature matrix is extracted from each layer, and the last layer calculates the image difference degree to give an evaluation value of the discriminator. For example, the number of convolutional kernels doubles for each layer of downward transmission.

In some embodiments, the loss function of the generator may include an adversarial loss function and an L1 loss function.

In some embodiments, S204 may include the following steps: calculating a first output value according to the first evaluation result and the adversarial loss function of the generator; calculating a second output value according to the historically restored image, a standard image corresponding to the historically restored image and the L1 loss function; and updating the weight of the convolutional kernel of the generator according to the first output value and the second output value.

For example, the first output value can be calculated based on the following formula:

$L_{g\_adv} = \max\limits_{G}\left( E_{z \sim p(z)}\left\lbrack D\left( G(z) \right) \right\rbrack \right);$

wherein $L_{g\_adv}$ represents the first output value, $\max\limits_{G}(\ )$ represents a value of G when the function $E_{z \sim p(z)}[D(G(z))]$ has a maximum value, G represents the generator, and the value of G calculated based on the above formula is the first output value; $E_{z \sim p(z)}(u)$ represents a mean value of a function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(G(z)) represents the first evaluation result, and G(z) represents the historically restored image.

For example, the second output value can be calculated based on the following formula:

L₁ = ∥x − G(z)∥₁;

wherein L₁ represents the second output value (i.e., the L1 loss function), x represents the standard image, z represents the to-be-processed image, and G(z) represents the historically restored image.

Furthermore, the weight of the convolutional kernel of the generator can be updated according to the sum of the first output value and the second output value.
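For illustration, a minimal PyTorch sketch of one generator update that sums the two output values as just described; the mean reduction of the L1 term and the sign convention used to maximize $E[D(G(z))]$ by gradient descent are assumptions of this example.

```python
import torch

def generator_step(G, D, z, x, optimizer_g):
    """One generator update: adversarial term (first output value) plus
    L1 term (second output value), summed as described above.
    z: to-be-processed images; x: corresponding standard images."""
    g_z = G(z)                        # restored image G(z)
    adv = -D(g_z).mean()              # maximize E[D(G(z))] via its negative
    l1 = torch.abs(x - g_z).mean()    # L1 = ||x - G(z)||_1, mean-reduced
    loss = adv + l1                   # sum of the two output values
    optimizer_g.zero_grad()
    loss.backward()
    optimizer_g.step()
    return loss.item()
```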

In some embodiments, the weight of the convolutional kernel of the discriminator can also be updated according to an evaluation result of the discriminator on the historically restored image of the generator and a standard image corresponding to the historically restored image.

Before, after, or at the same time as S204, the method also includes the following steps: acquiring the image historically restored by the generator and the corresponding standard image; inputting the historically restored image into the discriminator to obtain a first evaluation result, and inputting the standard image into the discriminator to obtain a second evaluation result; calculating a third output value at least according to an adversarial loss function of the discriminator, the first evaluation result and the second evaluation result; and updating the weight of the convolutional kernel of the discriminator according to the third output value.

For example, the third output value can be calculated based on the following formula:

$L_{d\_adv} = \min\limits_{D}\left( E_{x \sim q(x)}\left\lbrack \max\left( 0, 1 - D(x) \right) \right\rbrack + E_{z \sim p(z)}\left\lbrack \max\left( 0, 1 + D\left( G(z) \right) \right) \right\rbrack \right) + \lambda E_{\hat{I} \sim p_{\hat{I}}}\left( \left( \nabla_{\hat{I}} D\left( \hat{I} \right) \right)^{2} - 1 \right)^{2};$

wherein $L_{d\_adv}$ represents the third output value, $\min\limits_{D}(\ )$ represents a value of D when the function $E_{x \sim q(x)}[\max(0, 1 - D(x))] + E_{z \sim p(z)}[\max(0, 1 + D(G(z)))]$ has a minimum value, D represents the discriminator, the value of D calculated based on the above formula represents the third output value, $E_{x \sim q(x)}(u)$ represents a mean value of a function u when x obeys q(x), q(x) represents a distribution of the standard image, x represents the standard image, $E_{z \sim p(z)}(u)$ represents a mean value of the function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(x) represents the second evaluation result, D(G(z)) represents the first evaluation result, G(z) represents the historically restored image, λ represents a preset hyperparameter, ∇( ) represents a gradient penalty function, Î represents an interpolation function between the distribution of q(x) and the distribution of p(z), and $\nabla_{\hat{I}} D(\hat{I})$ represents a gradient penalty of the discriminator by the interpolation function Î between the distribution of q(x) and the distribution of p(z).

In some embodiments, the weight of the convolutional kernel of the discriminator can be updated based on the previous steps, and then S202 is executed to input the historically restored image into the updated discriminator.

In some embodiments, the steps of updating the weight of the convolutional kernel of the generator and updating the weight of the convolutional kernel of the discriminator may be performed several times to train the generator and the discriminator iteratively, until the difference between the first evaluation result and the second evaluation result falls into a preset tolerance range.
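Putting the two updates together, here is a minimal alternating training loop reusing the step functions sketched above; the concrete stopping test on the gap between the two evaluation results, and the names train and loader (yielding to-be-processed/standard pairs), are assumptions.

```python
import torch

def train(G, D, loader, opt_g, opt_d, tol=0.05, max_epochs=100):
    """Alternate discriminator and generator updates until the difference
    between the first and second evaluation results is within tolerance."""
    for _ in range(max_epochs):
        gap = float("inf")
        for z, x in loader:              # to-be-processed / standard pairs
            discriminator_step(G, D, z, x, opt_d)
            generator_step(G, D, z, x, opt_g)
            with torch.no_grad():
                gap = (D(x).mean() - D(G(z)).mean()).abs().item()
        if gap < tol:                    # D can no longer separate the two
            break
```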

In some embodiments, the discriminator may include a plurality of residual modules connected in series and configured to receive a feature map of a to-be-evaluated image or a feature map processed by an upper-level residual module. The to-be-evaluated image is the image historically restored by the generator or a standard image corresponding to the historically restored image, and each residual module includes one or more convolutional layers.

Specifically, the residual modules may include a sampling residual module (resblock) and a down-sampling residual module (resblock down).

The residual modules can be used to extract the features of the input image, and the arrangement of the residual modules helps avoid gradient disappearance when the weight of the convolutional kernel is updated iteratively to deepen the network.

For example, the plurality of residual modules connected in series may include a plurality of cascade-connected down-sampling residual modules and a sampling residual module, and an output of the sampling residual module is the output of the discriminator.

Alternatively, the output of the sampling residual module can be used as the output of the discriminator after a series of processing, such as normalization processing.

For example, a first down-sampling residual module of the discriminator receives the 256×256 historically restored image or the corresponding standard image, and the last down-sampling residual module outputs a feature of 4×4×512. The feature of 4×4×512 passes through one sampling residual module, then becomes a feature vector of 1×1×512 through a global average pooling, and is finally output as the output of the discriminator by a fully connected layer. With this solution, the final output of the discriminator is one number.
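For orientation, a structural skeleton of that pipeline in PyTorch; the module constructors passed in are placeholders, and the hard-coded placement of the self-attention module behind the second down-sampling residual module (discussed below) is part of this example, not a fixed requirement.

```python
import torch.nn as nn

class Discriminator(nn.Module):
    """Skeleton of the described pipeline: down-sampling residual modules,
    a self-attention module, one sampling residual module, global average
    pooling, and a fully connected layer producing one number."""
    def __init__(self, down_blocks, non_local, last_block):
        super().__init__()
        self.down_blocks = nn.ModuleList(down_blocks)  # resblock_down list
        self.non_local = non_local                     # self-attention module
        self.last_block = last_block                   # resblock -> 4x4x512
        self.fc = nn.Linear(512, 1)

    def forward(self, img):              # img: 256x256 restored/standard image
        feat = img
        for i, block in enumerate(self.down_blocks):
            feat = block(feat)
            if i == 1:                   # behind the second residual module
                feat = self.non_local(feat)
        feat = self.last_block(feat)     # (N, 512, 4, 4)
        feat = feat.mean(dim=(2, 3))     # global average pooling -> (N, 512)
        return self.fc(feat)             # one evaluation value per image
```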

Further, the discriminator also includes a self-attention module (Non-Local Block). An input end of the self-attention module is connected with one of the plurality of residual modules to receive the feature map processed by that residual module. The self-attention module is configured to extract global features of an input feature map, and an output end of the self-attention module is connected with another one of the plurality of residual modules.

For example, the input end of the self-attention module can be connected with an output end of a second residual module, and the output end of the self-attention module can be connected with an input end of a third residual module. In other words, the self-attention module is disposed behind the second down-sampling residual module. At this point, the width and height of the feature map are 64, thus the requirement for computational complexity is moderate and global features can be extracted well.

FIG. 3 shows a structural diagram of one residual module.

Specifically, referring to FIG. 3, the residual module can successively include a 4×4 convolutional layer, a normalization module (NormLayer), a linear correction unit (LeakyReLU), another 4×4 convolutional layer, a normalization module and a linear correction unit.

For example, the normalization module can be an instance normalization module, which can accelerate model convergence and maintain the independence between image instances.

In the feature map after passing through the linear correction unit, a part greater than zero remains unchanged, and a part less than zero can be multiplied by a preset constant to achieve the purpose of linear correction. The preset constant can be 0.2.

Further, the residual module includes a channel attention module. The channel attention module is configured to perform a weighting process on channels of the input feature map, and the input feature map is a feature map processed by the one or more convolutional kernels of the residual module. Thus, the channel attention module helps improve the effect of image restoration.

For example, an input end of the channel attention module can be connected with an output end of the second linear correction unit.

Furthermore, the residual module can also include a sum unit (marked with “+” in the figure). The sum unit has two inputs, one of which is the output of the channel attention module, and the other of which is the initial input of the residual module via a shortcut connection. The sum unit adds up the two inputs to obtain the output of the residual module.

In some embodiments, the residual module can include two convolutional layers, which is conducive to paying attention to more features.
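A sketch of this residual module in PyTorch; the kernel size 4, the 0.2 negative slope, instance normalization and the injected channel-attention module follow the description, while the "same" padding scheme is an assumption of this example.

```python
import torch.nn as nn

class ResBlock(nn.Module):
    """Residual module of FIG. 3: (4x4 conv -> InstanceNorm -> LeakyReLU)
    twice, then channel attention, then the "+" sum unit with a shortcut."""
    def __init__(self, channels, ca_module):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 4, padding="same"),
            nn.InstanceNorm2d(channels),
            nn.LeakyReLU(0.2),           # negative part scaled by 0.2
            nn.Conv2d(channels, channels, 4, padding="same"),
            nn.InstanceNorm2d(channels),
            nn.LeakyReLU(0.2),
        )
        self.ca = ca_module              # channel attention module (FIG. 4)

    def forward(self, x):
        return x + self.ca(self.body(x))  # sum unit: shortcut + attended path
```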

FIG. 4 is a structural diagram of the channel attention module in FIG. 3.

Specifically, referring to FIG. 4, the channel attention module may include a global average pooling unit, a linear correction unit and an s activation function (sigmoid) unit. The global average pooling unit is configured to perform a global average pooling on the input feature map. The linear correction unit is configured to perform linear correction on the feature map after the global average pooling. The s activation function unit is configured to determine a weight of each channel according to the feature map after the linear correction. Adjacent units are connected through a full connection layer.

Furthermore, the channel attention module also includes a first weighted summation unit. The first weighted summation unit is configured to perform a weighted summation on the input feature map according to the weight of each channel determined by the s activation function unit. For example, one of the two inputs of the first weighted summation unit is the feature map initially input to the channel attention module, obtained via a shortcut connection, and the other is the weight of each channel determined by the s activation function unit. The first weighted summation unit performs a weighted summation processing on the initial input feature map based on the weight of each channel to obtain the output result.
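A minimal PyTorch sketch of this channel attention module; the unit order (global average pooling, full connection, linear correction, full connection, sigmoid, re-scaling) follows the description, while the reduction ratio of the fully connected bottleneck is an assumption, since the text only fixes the unit order.

```python
import torch.nn as nn

class ChannelAttention(nn.Module):
    """FIG. 4: global average pooling -> full connection -> linear
    correction (ReLU) -> full connection -> sigmoid weights, then the
    first weighted summation re-scales the initially input feature map."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)            # global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),                     # linear correction unit
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),                              # s activation function
        )

    def forward(self, x):
        n, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(n, c)).view(n, c, 1, 1)
        return x * w                                   # per-channel weighting
```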

FIG. 5 is a schematic structural diagram of the self-attention module.

Referring to FIG. 5, the self-attention module may include a query unit. The query unit is configured to perform convolution on the input processed feature map by a query convolutional kernel to obtain a query convolution processing result. For example, a feature map of N×H×W×256 output by the second residual module is input into the query unit. Assuming that the query convolutional kernel is 1×1×1, the query convolution processing result is N×H×W×32, where N represents the number of input images, W represents the width of the feature map, and H represents the height of the feature map.

The self-attention module may further include a key unit. The key unit is configured to perform convolution on the input processed feature map by a key convolutional kernel to obtain a key convolution processing result. For example, the feature map of N×H×W×256 output by the second residual module is input into the key unit. Assuming that the key convolutional kernel is 1×1×1, the key convolution processing result is N×H×W×32.

The self-attention module may further include a value unit. The value unit is configured to perform convolution on the input processed feature map by a value convolutional kernel to obtain a value convolution processing result. For example, the feature map of N×H×W×256 output by the second residual module is input into the value unit. Assuming that the value convolutional kernel is 1×1×1, the value convolution processing result is N×H×W×128.

The self-attention module may further include a self-similarity calculation unit. The self-similarity calculation unit is configured to calculate a self-similarity degree of the query convolution processing result and the key convolution processing result. For example, after the query convolution processing result is processed by max pooling, the output is N×(H/2)×(W/2)×32, which is input into the self-similarity calculation unit together with the key convolution processing result for calculating the self-similarity degree.

The self-attention module may further include a normalization processing unit. The normalization processing unit is configured to normalize the calculated self-similarity degree based on a preset regression function to obtain a normalized weight. For example, the normalization processing unit may normalize the output of the self-similarity calculation unit by using a softmax function.

The self-attention module may further include a second weighted summation unit. The second weighted summation unit is configured to perform a weighted summation on the value convolution processing result according to the normalized weight to obtain a weighted summation result. For example, according to the input parameters of the preceding example, the weighted summation result may be a feature map of N×H×W×128.

Further, the weighted summation result output by the second weighted summation unit passes through a 1×1×1 convolutional layer to produce a feature map of N×H×W×256.

Further, matrix addition is performed on the feature map of N×H×W×256 output by the 1×1×1 convolutional layer and the feature map of N×H×W×256 initially input to the self-attention module to obtain the final output of the self-attention module, that is, a feature map of N×H×W×256.
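For illustration, a compact non-local block following FIG. 5 with the example channel widths (32/32/128 → 256); note that this sketch pools the key and value branches (a common variant) so that the output keeps the input resolution, whereas the text above applies the max pooling to the query branch — that rearrangement is an assumption of this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonLocalBlock(nn.Module):
    """FIG. 5: 1x1 query/key/value convolutions, softmax-normalized
    self-similarity, weighted summation over the value branch, a 1x1
    output convolution, and a residual addition with the input."""
    def __init__(self, channels=256):
        super().__init__()
        self.query = nn.Conv2d(channels, 32, 1)
        self.key = nn.Conv2d(channels, 32, 1)
        self.value = nn.Conv2d(channels, 128, 1)
        self.out = nn.Conv2d(128, channels, 1)

    def forward(self, x):
        n, _, h, w = x.shape
        q = self.query(x).flatten(2)                   # (N, 32, HW)
        k = F.max_pool2d(self.key(x), 2).flatten(2)    # (N, 32, HW/4)
        v = F.max_pool2d(self.value(x), 2).flatten(2)  # (N, 128, HW/4)
        sim = torch.bmm(q.transpose(1, 2), k)          # self-similarity
        attn = F.softmax(sim, dim=-1)                  # normalized weights
        agg = torch.bmm(v, attn.transpose(1, 2))       # weighted summation
        return x + self.out(agg.view(n, 128, h, w))    # residual output
```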

Generally speaking, the higher the hierarchical position of the self-attention module in the discriminator, the higher the accuracy of the evaluation result and the greater the amount of computation.

Embodiments of the present disclosure perform image restoration based on the neural network model, and the convolutional kernel of the neural network model is adjusted according to historical restoration results to optimize the quality of the restored image output by the generator. Specifically, the process of adjusting the convolutional kernel can at least be implemented in a model training stage according to the quality of the historically restored image of the generator, so as to obtain a generator that is more in line with actual needs. Further, in the actual application stage, the convolutional kernel can be further adjusted based on the historically restored images. With the increase of historical data, the amount of feedback data for adjusting the convolutional kernel becomes larger and larger, making the adjustment of the convolutional kernel more accurate, which helps improve the restoration quality when performing image restoration based on machine learning.

FIG. 6 is a schematic diagram of an apparatus 6 for image restoration according to an embodiment of the present disclosure. It is understood by those skilled in the art that the apparatus 6 can be used to implement the method according to the embodiments as shown in FIGS. 1 to 5.

Referring to FIG. 6, the apparatus 6 may include a first acquisition circuitry 61, a processing circuitry 62 and a second acquisition circuitry 63. The first acquisition circuitry 61 is configured to acquire a to-be-processed image including biometric information, and to input the to-be-processed image into a generator. The processing circuitry 62 includes the generator. The generator is configured to restore the to-be-processed image and includes a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator. The second acquisition circuitry 63 is configured to acquire a restored image output by the generator.

For more details on the working principles and working methods of the apparatus 6, reference may be made to the related descriptions with reference to FIGS. 1 to 5; they are not described in detail here.

Furthermore, another embodiment of the present disclosure provides a storage medium. The storage medium has computer instructions stored therein, and the computer instructions are executed to perform steps of the method according to the embodiments as shown in FIGS. 1 to 5. In some embodiments, the storage medium may include a computer readable storage medium, such as a non-volatile memory or a non-transitory memory. The computer readable storage medium may include a Read Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk.

Furthermore, another embodiment of the present disclosure provides a terminal including a memory and a processor. The memory is stored with computer instructions executable on the processor, and the computer instructions are executed by the processor to perform steps of the method according to the embodiments as shown in FIGS. 1 to 5. In some embodiments, the terminal may be an intelligent device, such as a mobile phone, an iPad, etc.

Although the present disclosure has been disclosed above, the present disclosure is not limited thereto. Any changes and modifications may be made by those skilled in the art without departing from the spirit and scope of the present disclosure, and the scope of the present disclosure should be determined by the appended claims.

CLAIMS

1. A method for image restoration, comprising: acquiring a to-be-processed image, wherein the to-be-processed image comprises biometric information; inputting the to-be-processed image into a generator, wherein the generator comprises a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator; and restoring the to-be-processed image by the generator to acquire a restored image.

2. The method according to claim 1, wherein determining a weight of a convolutional kernel of the generator at least according to a quality of an image historically restored by the generator comprises: updating the weight of the convolutional kernel of the generator according to an evaluation result of a discriminator on the image historically restored by the generator, wherein the discriminator comprises a neural network model with a plurality of convolutional layers.
3. The method according to claim 2, wherein there is a connection relationship among the plurality of convolutional layers of the neural network model of the generator and/or the plurality of convolutional layers of the neural network model of the discriminator.

4. The method according to claim 2, wherein updating the weight of the convolutional kernel of the generator according to an evaluation result of a discriminator on the image historically restored by the generator comprises: acquiring the image historically restored by the generator; inputting the historically restored image into the discriminator; acquiring a first evaluation result of the discriminator on the historically restored image; and updating the weight of the convolutional kernel of the generator at least according to a loss function of the generator and the first evaluation result.

5. The method according to claim 4, wherein the loss function of the generator comprises an adversarial loss function and an L1 loss function, and updating the weight of the convolutional kernel of the generator at least according to the loss function of the generator and the first evaluation result comprises: calculating a first output value according to the first evaluation result and the adversarial loss function of the generator; calculating a second output value according to the historically restored image, a standard image corresponding to the historically restored image and the L1 loss function; and updating the weight of the convolutional kernel of the generator according to the first output value and the second output value.
6. The method according to claim 5, wherein calculating a first output value according to the first evaluation result and the adversarial loss function of the generator comprises: calculating the first output value based on the following formula: $L_{g\_adv} = \max\limits_{G}\left( E_{z \sim p(z)}\left\lbrack D\left( G(z) \right) \right\rbrack \right);$ wherein $L_{g\_adv}$ represents the first output value, $\max\limits_{G}(\ )$ represents a value of G when the function $E_{z \sim p(z)}[D(G(z))]$ has a maximum value, G represents the generator, and the value of G calculated based on the above formula is the first output value; $E_{z \sim p(z)}(u)$ represents a mean value of a function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(G(z)) represents the first evaluation result, and G(z) represents the historically restored image.

7. The method according to claim 5, wherein calculating a second output value according to the historically restored image, a standard image corresponding to the historically restored image and the L1 loss function comprises: calculating the second output value based on the following formula: L₁ = ∥x − G(z)∥₁; wherein L₁ represents the second output value, x represents the standard image, z represents the to-be-processed image, and G(z) represents the historically restored image.
8. The method according to claim 2, further comprising: updating the weight of the convolutional kernel of the discriminator according to an evaluation result of the discriminator on the image historically restored by the generator and a standard image corresponding to the historically restored image.

9. The method according to claim 8, wherein updating the weight of the convolutional kernel of the discriminator according to an evaluation result of the discriminator on the historically restored image by the generator and a standard image corresponding to the historically restored image comprises: acquiring the image historically restored by the generator and the corresponding standard image; inputting the historically restored image into the discriminator to obtain a first evaluation result, and inputting the standard image into the discriminator to obtain a second evaluation result; calculating a third output value at least according to an adversarial loss function of the discriminator, the first evaluation result and the second evaluation result; and updating the weight of the convolutional kernel of the discriminator according to the third output value.
10. The method according to claim 9, wherein calculating a third output value at least according to an adversarial loss function of the discriminator, the first evaluation result and the second evaluation result comprises: calculating the third output value based on the following formula: $L_{d\_adv} = \min\limits_{D}\left( E_{x \sim q(x)}\left\lbrack \max\left( 0, 1 - D(x) \right) \right\rbrack + E_{z \sim p(z)}\left\lbrack \max\left( 0, 1 + D\left( G(z) \right) \right) \right\rbrack \right) + \lambda E_{\hat{I} \sim p_{\hat{I}}}\left( \left( \nabla_{\hat{I}} D\left( \hat{I} \right) \right)^{2} - 1 \right)^{2};$ wherein $L_{d\_adv}$ represents the third output value, $\min\limits_{D}(\ )$ represents a value of D when the function $E_{x \sim q(x)}[\max(0, 1 - D(x))] + E_{z \sim p(z)}[\max(0, 1 + D(G(z)))]$ has a minimum value, D represents the discriminator, the value of D calculated based on the above formula represents the third output value, $E_{x \sim q(x)}(u)$ represents a mean value of a function u when x obeys q(x), q(x) represents a distribution of the standard image, x represents the standard image, $E_{z \sim p(z)}(u)$ represents a mean value of the function u when z obeys p(z), p(z) represents a distribution of the historically restored image, z represents the to-be-processed image, D(x) represents the second evaluation result, D(G(z)) represents the first evaluation result, G(z) represents the historically restored image, λ represents a preset hyperparameter, ∇( ) represents a gradient penalty function, Î represents an interpolation function between the distribution of q(x) and the distribution of p(z), and $\nabla_{\hat{I}} D(\hat{I})$ represents a gradient penalty of the discriminator by the interpolation function Î between the distribution of q(x) and the distribution of p(z).
11. The method according to claim 8, wherein updating the weight of the convolutional kernel of the generator and updating the weight of the convolutional kernel of the discriminator are performed several times.

12. The method according to claim 2, wherein the discriminator comprises: a plurality of residual modules connected in series and configured to receive a feature map of a to-be-evaluated image or a feature map processed by an upper-level residual module, wherein the to-be-evaluated image is the image historically restored by the generator or a standard image corresponding to the historically restored image, and each residual module comprises one or more convolutional layers; and a self-attention module having an input end connected with a residual module of the plurality of residual modules to receive the feature map processed by the residual module, wherein the self-attention module is configured to extract global features of an input feature map, and an output end of the self-attention module is connected with another residual module of the plurality of residual modules.

13. The method according to claim 12, wherein one or more of the plurality of residual modules comprise: a channel attention module configured to perform a weighting process on channels of the input feature map, and the input feature map is a feature map processed by the one or more convolutional kernels of the residual module.

14. The method according to claim 13, wherein the channel attention module comprises: a global average pooling unit configured to perform a global average pooling on the input feature map; a linear correction unit configured to perform a linear correction on the feature map on which the global average pooling has been performed; and an s activation function unit configured to determine a weight of each channel according to the feature map on which the linear correction has been performed; wherein adjacent units are connected through a full connection layer.

15. The method according to claim 14, wherein the channel attention module further comprises: a first weighted summation unit configured to perform a weighted summation on the input feature map according to the weight of each channel determined by the s activation function unit.
16. The method according to claim 12, wherein the self-attention module comprises: a query unit configured to perform convolution on the input processed feature map by a query convolutional kernel to obtain a query convolution processing result; a key unit configured to perform convolution on the input processed feature map by a key convolutional kernel to obtain a key convolution processing result; a value unit configured to perform convolution on the input processed feature map by a value convolutional kernel to obtain a value convolution processing result; a self-similarity calculation unit configured to calculate a self-similarity degree of the query convolution processing result and the key convolution processing result; a normalization processing unit configured to normalize the calculated self-similarity based on a preset regression function to obtain a normalized weight; and a second weighted summation unit configured to perform weighted summation on the value convolution processing result according to the normalized weight to obtain a weighted summation result, wherein the output of the self-attention module is generated according to the weighted summation result.

17. The method according to claim 1, wherein the generator processes the to-be-processed image based on partial convolution and/or LBAM.

18. An apparatus for image restoration, comprising: a first acquisition circuitry configured to acquire a to-be-processed image, wherein the to-be-processed image comprises biometric information, and to input the to-be-processed image into a generator; and a processing circuitry comprising the generator, wherein the generator is configured to restore the to-be-processed image and comprises a neural network model with a plurality of convolutional layers, and a weight of a convolutional kernel of the generator is determined at least according to a quality of an image historically restored by the generator.
19. A non-transitory storage medium having computer instructions stored therein, wherein the computer instructions are executed to perform steps of the method according to claim 1.

20. A terminal comprising a memory and a processor, wherein the memory is stored with computer instructions executable on the processor, and the computer instructions are executed by the processor to perform steps of the method according to claim 1.